Coincidences And Entropies of Random Variables with Very Large Alphabets
Abstract
We examine the recently introduced NSB estimator of entropies of severely undersampled discrete variables and devise a procedure for calculating the integrals involved. We find that the output of the estimator has a well-defined limit for large cardinalities of the variables being studied. One can therefore estimate entropies with no a priori assumptions about these cardinalities, and a closed-form solution for such estimates is given.
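To make the role of coincidences concrete, here is a minimal sketch of a birthday-paradox style estimate. It is not the NSB estimator and it evaluates none of the integrals mentioned above; the function name `coincidence_entropy_estimate` is hypothetical. It assumes the samples are independent draws, counts the pairs of samples that coincide, and returns the logarithm of the inverse coincidence rate. For a uniform distribution over K states this approaches ln K; in general it estimates the collision (Rényi order-2) entropy, which is a lower bound on the Shannon entropy.

```python
import math
import random
from collections import Counter


def coincidence_entropy_estimate(samples):
    """Estimate entropy (in nats) from pairwise coincidences.

    Illustrative only: this is a plain collision-entropy estimate,
    not the NSB estimator discussed in the abstract.
    """
    n = len(samples)
    counts = Counter(samples)
    # A symbol seen c times contributes c*(c-1)/2 coinciding pairs.
    coinciding_pairs = sum(c * (c - 1) // 2 for c in counts.values())
    if coinciding_pairs == 0:
        # Without coincidences the sample only bounds the entropy from below.
        raise ValueError("no coincidences in the sample")
    total_pairs = n * (n - 1) / 2
    # The coincidence rate estimates sum_i p_i^2, so the negative log of it
    # estimates the collision entropy (ln K for a uniform alphabet of size K).
    return math.log(total_pairs / coinciding_pairs)


if __name__ == "__main__":
    # Hypothetical demo: 500 draws from a uniform alphabet of 10,000 symbols,
    # far fewer samples than symbols (the undersampled regime).
    rng = random.Random(0)
    alphabet_size = 10_000
    draws = [rng.randrange(alphabet_size) for _ in range(500)]
    print("estimate:", coincidence_entropy_estimate(draws))
    print("ln K:    ", math.log(alphabet_size))
```

The estimate is usable only once at least one coincidence has occurred, which for a near-uniform variable takes on the order of the square root of the alphabet size in samples.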
Similar resources
Coincidences and Estimation of Entropies of Random Variables with Large Cardinalities
We perform an asymptotic analysis of the NSB estimator of entropy of a discrete random variable. The analysis illuminates the dependence of the estimates on the number of coincidences in the sample and shows that the estimator has a well-defined limit for a large cardinality of the studied variable. This allows estimation of entropy with no a priori assumptions about the cardinality. Software i...
Irwin and Joan Jacobs Center for Communication and Information Technologies: Entropy Bounds for Discrete Random Variables via Coupling
This paper derives new entropy bounds for discrete random variables via maximal coupling. It provides bounds on the difference between the entropies of two discrete random variables in terms of the local and total variation distances between their probability mass functions. These bounds address cases of finite or countably infinite alphabets. Particular cases of these bounds reproduce some kno...
Inference of Entropies of Discrete Random Variables with Unknown Cardinalities
We examine the recently introduced NSB estimator of entropies of severely undersampled discrete variables and devise a procedure for calculating the integrals involved. We find that the output of the estimator has a well-defined limit for large cardinalities of the variables being studied. One can therefore estimate entropies with no a priori assumptions about these cardinalities, and a closed f...
Characterizations Using Entropies of Records in a Geometric Random Record Model
Suppose that a geometrically distributed number of observations is available from an absolutely continuous distribution function $F$, and denote the random number of records within this set of observations by $M$. This is called the geometric random record model. In this paper, characterizations of $F$ are provided in terms of the subsequences entropies of records conditional on events $\{M \geq n\}$ or ...
Of fishes and birthdays: Efficient estimation of polymer configurational entropies
We present an algorithm to estimate the configurational entropy S of a polymer. The algorithm uses the statistics of coincidences among random samples of configurations and is related to the catch-tag-release method for estimation of population sizes, and to the classic “birthday paradox”. Bias in the entropy estimation is decreased by grouping configurations in nearly equiprobable partitions b...